4 Binary Neural Architecture Search

4.1 Background

Deep convolutional neural networks (DCNNs) have become the dominant performers on various computer vision tasks such as image classification [84], instance segmentation [163], and object detection [220], owing to the great success of deep network architecture design. With the increasing demand for architecture engineering, neural architecture search (NAS), which automatically generates well-crafted neural architectures instead of requiring complex architectures to be designed manually, has become one of the most effective approaches for many tasks.

Thanks to the rapid development of deep learning, significant performance gains have been achieved on a wide range of computer vision tasks, most of them built on manually designed network architectures [123, 211, 84, 92]. Neural architecture search (NAS) has recently attracted increasing attention; its goal is to find automatic ways of designing neural architectures that replace conventional hand-crafted ones. Existing NAS approaches must explore a huge search space and can be roughly divided into three categories: evolution-based, reinforcement-learning-based, and one-shot-based.

To make architecture search feasible within a short period, researchers try to reduce the cost of evaluating each searched candidate. Early efforts include sharing weights between searched and newly generated networks [27]. Later, this idea was generalized into a more elegant framework called one-shot architecture search [20, 28, 151, 188, 254]. In these approaches, an over-parameterized network or super-network covering all candidate operations is trained only once, and the final architecture is obtained by sampling from this super-network. For example, [20] trained the over-parameterized network using a HyperNet [81], and [188] proposed sharing parameters among child models to avoid retraining each candidate from scratch. DARTS [151] introduced a differentiable framework and thus combined the search and evaluation stages into one. Despite its simplicity, researchers have identified several drawbacks of DARTS and proposed improved approaches [254, 39]. PDARTS [39] presents an efficient algorithm that allows the depth of searched architectures to grow gradually during the training procedure, significantly reducing search time. ProxylessNAS [29] adopted the differentiable framework and proposed searching architectures directly on the target task instead of using the conventional proxy-based framework.
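To make the one-shot idea concrete, the following is a minimal PyTorch-style sketch of a DARTS-like mixed operation on a single super-network edge: every candidate operation is applied and their outputs are combined with softmax-normalized architecture parameters, so the discrete choice of operation becomes differentiable. The candidate set, class names, and initialization here are illustrative assumptions, not the exact configuration used by DARTS [151].

import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical candidate operations for one edge of a searched cell;
# real search spaces typically use separable/dilated convolutions, pooling, etc.
def candidate_ops(channels):
    return nn.ModuleList([
        nn.Identity(),                                            # skip connection
        nn.Conv2d(channels, channels, 3, padding=1, bias=False),  # 3x3 conv
        nn.Conv2d(channels, channels, 5, padding=2, bias=False),  # 5x5 conv
        nn.AvgPool2d(3, stride=1, padding=1),                     # 3x3 avg pool
    ])

class MixedOp(nn.Module):
    """One super-network edge: a softmax-weighted sum of all candidate ops.

    The architecture parameters `alpha` are trained jointly with the network
    weights; after search, only the op with the largest alpha is kept.
    """
    def __init__(self, channels):
        super().__init__()
        self.ops = candidate_ops(channels)
        self.alpha = nn.Parameter(1e-3 * torch.randn(len(self.ops)))

    def forward(self, x):
        weights = F.softmax(self.alpha, dim=0)      # relax the discrete choice
        return sum(w * op(x) for w, op in zip(weights, self.ops))

# Usage: after optimizing weights and alphas, derive the discrete architecture.
edge = MixedOp(channels=16)
y = edge(torch.randn(2, 16, 32, 32))
chosen = edge.ops[int(edge.alpha.argmax())]

Because the relaxation makes the edge differentiable with respect to alpha, search reduces to ordinary gradient descent over the super-network, which is why the search and evaluation stages can be merged.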

Binary neural architecture search replaces real-valued weights and activations with binarized ones, which consumes far less memory and computation while searching for binary networks and thus provides a more promising way to find network architectures efficiently, as sketched below.
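As a rough illustration of what this binarization means inside a searched network, the sketch below (again PyTorch-style, with names and the straight-through estimator chosen as common conventions rather than the exact scheme of any cited method) binarizes both weights and activations with the sign function before a convolution; storing 1-bit weights instead of 32-bit floats is what yields the savings mentioned above.

import torch

class BinarizeSTE(torch.autograd.Function):
    """Sign binarization with a straight-through estimator (STE).

    Forward maps real values to {-1, +1} (sign(0) = 0 is ignored for brevity);
    backward passes gradients through unchanged for inputs inside [-1, 1] and
    clips them outside, a common choice in binary networks.
    """
    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        return torch.sign(x)

    @staticmethod
    def backward(ctx, grad_out):
        (x,) = ctx.saved_tensors
        return grad_out * (x.abs() <= 1).float()

def binary_conv2d(x, weight, **kw):
    # Both activations and weights are binarized before the convolution, so a
    # deployed kernel can replace multiply-accumulate with XNOR and popcount.
    xb = BinarizeSTE.apply(x)
    wb = BinarizeSTE.apply(weight)
    return torch.nn.functional.conv2d(xb, wb, **kw)

# Usage: a 1-bit weight needs 1/32 the memory of a float32 weight.
w = torch.randn(8, 3, 3, 3, requires_grad=True)
out = binary_conv2d(torch.randn(1, 3, 32, 32), w, padding=1)
out.sum().backward()   # gradients reach w via the STE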

These methods can be categorized into direct binary architecture search and auxiliary binary architecture search. Direct binary architecture search yields binary architectures directly from well-designed binary search spaces. As the first work in this field, BNAS1 [36] effectively reduces search time through channel sampling and search space pruning in the early training stages of a differentiable NAS. BNAS2 [114] exploits diversity early in the search to learn better-performing binary architectures. BMES [189] learns an efficient binary MobileNet [90] architecture through evolution-based search. However, the accuracy of the direct